Search results for "outlier"
showing 10 items of 11 documents
TERMITE: An R script for fast reduction of laser ablation inductively coupled plasma mass spectrometry data and its application to trace element measur…
2017
RATIONALE: High spatial resolution Laser Ablation Inductively Coupled Plasma Mass Spectrometry (LA-ICPMS) determination of trace element concentrations is of great interest for geological and environmental studies. Data reduction is a very important aspect of LA-ICPMS, and several commercial programs for handling LA-ICPMS trace element data are available. Each of these software packages has its specific advantages and disadvantages. METHODS: Here we present TERMITE, an R script for the reduction of LA-ICPMS data, which can reduce both spot and line scan measurements. Several parameters can be adjusted by the user, who does not necessarily need prior knowledge in R. Currently, ten reference m…
Influence Diagnostics for Meta-Analysis of Individual Patient Data Using Generalized Linear Mixed Models
2014
In meta-analysis, generalized linear mixed models (GLMMs) are usually used when heterogeneity is present and individual patient data (IPD) are available, while accepting binary, discrete, and continuous response variables. In the present paper some measures of influence diagnostics based on the log-likelihood are suggested and discussed. A known measure is approximated to obtain a simpler form, for which the information matrix is no longer necessary. The performance of the proposed measure is assessed through a diagnostic analysis on simulated data reproducing a possible meta-analytical context of IPD with influential outliers. The proposed measure is shown to work well and to have a form sim…
Influence diagnostics for generalized linear mixed models: a gradient-like statistic
2013
In the literature, many influence measures proposed for Generalized Linear Mixed Models (GLMMs) require the information matrix, which can be difficult to calculate. In the present paper, a known influence measure is approximated to obtain a simpler form, for which the information matrix is no longer necessary. The proposed measure is shown to have a form similar to the recently introduced gradient statistic. Good performance has been obtained in simulation studies.
RAD SNP markers as a tool for conservation of dolphinfish Coryphaena hippurus in the Mediterranean Sea: Identification of subtle genetic structure an…
2016
Dolphinfish is an important fish species for both commercial and sport fishing, but so far limited information is available on genetic variability and pattern of differentiation of dolphinfish populations in the Mediterranean basin. Recently developed techniques allow genome-wide identification of genetic markers for better understanding of population structure in species with limited genome information. Using restriction-site associated DNA analysis we successfully genotyped 140 individuals of dolphinfish from eight locations in the Mediterranean Sea at 3324 SNP loci. We identified 311 sex-related loci that were used to assess sex-ratio in dolphinfish populations. In addition, we identifie…
Improving point matching on multimodal images using distance and orientation automatic filtering
2016
Speeded Up Robust Features (SURF) is one of the most popular and efficient methods used for image registration. To achieve a correct registration, a good matching of feature points is required. However, in the case of multimodal images, the large and non-linear intensity changes between modalities lead to many outliers (mismatched detected points) and consequently to a failed registration. Therefore, in this paper we introduce an efficient method devoted to the detection and removal of such outliers. It is based on automatic filtering of outliers using both the distance and the orientation between feature points. We tested our proposed method on a set of…
Diagnostics for meta-analysis based on generalized linear mixed models
2012
Meta-analysis is a method for combining data from multiple studies, with the aim of providing an overall event-risk measure of interest that summarizes the information from those studies. In meta-analysis, generalized linear mixed models (GLMMs) are widely used for a number of measures of interest, since they allow the true effect size to differ from study to study while accepting binary, discrete, and continuous response variables. In the present paper some strategies for influence diagnostics based on the log-likelihood are suggested and discussed. These are considered for Individual Patient Data, Aggregate Data, and their combination.
Outlier detection to hierarchical and mixed effects models
2008
Hierarchical and mixed effects models are models in which a varying number of coefficients may be random at different levels of the hierarchy. The purpose of outlier analysis for these models is to determine whether an outlying unit at a higher level is entirely outlying, or outlying due to the effect of one or a few aberrant lower-level units. Most work on diagnostics for these complex models has focused on the mixed model rather than on the hierarchical model, obscuring some relevant aspects of the hierarchical model. In this paper we present an approach to influence analysis and outlier detection for mixed and hierarchical models, focusing on the special structure of nested data that these…
Iteratively reweighted least squares in crystal structure refinements
2011
The use of robust techniques in crystal structure multipole refinements of small molecules as an alternative to the commonly adopted weighted least squares is presented and discussed. As is well known, the main disadvantage of least-squares fitting is its sensitivity to outliers. The elimination from the data set of the most aberrant reflections (due to both experimental errors and incompleteness of the model) is an effective practice that could yield satisfactory results, but it is often complicated in the presence of a great number of bad data points, whose one-by-one elimination could become unattainable. This problem can be circumvented by means of a robust least-squares regression that…
A gradient-based deletion diagnostic measure for generalized linear mixed models
2016
A gradient-statistic-based diagnostic measure is developed in the context of generalized linear mixed models. Its performance is assessed by some real examples and simulation studies, in terms of its ability to detect influential data structures and its concordance with the most used influence measures.
A new method to "clean up" ultra high-frequency data
2007
In applied econometrics, the availability of ultra high-frequency databases is having an important impact on research in market microstructure theory. These databases contain detailed reports of all the available information on financial market activity. However, they cannot be used directly: on the one hand, recording mistakes can be present; on the other hand, missing information has to be inferred from the available data. In this paper, we propose a simple method to clean ultra high-frequency data of possible errors, and we examine the method's efficacy when we analyze data by using an autoregressive conditional duration (A…